[AMDGPU] using loop to define data type convert patterns #132899
Conversation
@llvm/pr-subscribers-backend-amdgpu

Author: None (Shoreshen)

Changes: using loops to define the data type convert patterns.

Patch is 1.80 MiB, truncated to 20.00 KiB below; full version: https://github.com/llvm/llvm-project/pull/132899.diff (17 files affected).
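As an illustration of the idiom the patch applies (a minimal sketch copied from the diff below, using `VReg_128` as one representative register class): a nested `foreach` over the register class's `RegTypes` list emits a `BitConvert` pattern for every ordered pair of distinct types, replacing the long hand-written `def : BitConvert <...>` lists. The 16-, 32-, and 64-bit cases loop over the standalone `Reg16Types`/`Reg32DataTypes`/`Reg64DataTypes` lists instead, so the pointer types split out in SIRegisterInfo.td stay out of the bitcast patterns.

```tablegen
// Enumerate every ordered pair of distinct value types supported by the
// register class and emit one BitConvert pattern per pair; the
// !not(!eq(...)) guard skips the trivial same-type case.
foreach vt = VReg_128.RegTypes in {
  foreach st = VReg_128.RegTypes in {
    if !not(!eq(vt, st)) then {
      def : BitConvert <vt, st, VReg_128>;
    }
  }
}
```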
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index 900aed5b3f994..31b8b4254c4c4 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -1592,361 +1592,249 @@ foreach Index = 0-31 in {
// FIXME: Why do only some of these type combinations for SReg and
// VReg?
// 16-bit bitcast
-def : BitConvert <i16, f16, VGPR_32>;
-def : BitConvert <f16, i16, VGPR_32>;
-def : BitConvert <f16, bf16, VGPR_32>;
-def : BitConvert <bf16, f16, VGPR_32>;
-
-def : BitConvert <i16, f16, SReg_32>;
-def : BitConvert <f16, i16, SReg_32>;
-def : BitConvert <f16, bf16, SReg_32>;
-def : BitConvert <bf16, f16, SReg_32>;
+foreach vt = Reg16Types.types in {
+ foreach st = Reg16Types.types in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VGPR_32>;
+ }
+ }
+}
-def : BitConvert <i16, bf16, VGPR_32>;
-def : BitConvert <bf16, i16, VGPR_32>;
-def : BitConvert <i16, bf16, SReg_32>;
-def : BitConvert <bf16, i16, SReg_32>;
+foreach vt = Reg16Types.types in {
+ foreach st = Reg16Types.types in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_32>;
+ }
+ }
+}
// 32-bit bitcast
-def : BitConvert <i32, f32, VGPR_32>;
-def : BitConvert <f32, i32, VGPR_32>;
-def : BitConvert <i32, f32, SReg_32>;
-def : BitConvert <f32, i32, SReg_32>;
-def : BitConvert <v2i16, i32, SReg_32>;
-def : BitConvert <i32, v2i16, SReg_32>;
-def : BitConvert <v2f16, i32, SReg_32>;
-def : BitConvert <i32, v2f16, SReg_32>;
-def : BitConvert <v2i16, v2f16, SReg_32>;
-def : BitConvert <v2f16, v2i16, SReg_32>;
-def : BitConvert <v2f16, f32, SReg_32>;
-def : BitConvert <f32, v2f16, SReg_32>;
-def : BitConvert <v2i16, f32, SReg_32>;
-def : BitConvert <f32, v2i16, SReg_32>;
-def : BitConvert <v2bf16, i32, SReg_32>;
-def : BitConvert <i32, v2bf16, SReg_32>;
-def : BitConvert <v2bf16, i32, VGPR_32>;
-def : BitConvert <i32, v2bf16, VGPR_32>;
-def : BitConvert <v2bf16, v2i16, SReg_32>;
-def : BitConvert <v2i16, v2bf16, SReg_32>;
-def : BitConvert <v2bf16, v2i16, VGPR_32>;
-def : BitConvert <v2i16, v2bf16, VGPR_32>;
-def : BitConvert <v2bf16, v2f16, SReg_32>;
-def : BitConvert <v2f16, v2bf16, SReg_32>;
-def : BitConvert <v2bf16, v2f16, VGPR_32>;
-def : BitConvert <v2f16, v2bf16, VGPR_32>;
-def : BitConvert <f32, v2bf16, VGPR_32>;
-def : BitConvert <v2bf16, f32, VGPR_32>;
-def : BitConvert <f32, v2bf16, SReg_32>;
-def : BitConvert <v2bf16, f32, SReg_32>;
+foreach vt = Reg32DataTypes.types in {
+ foreach st = Reg32DataTypes.types in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VGPR_32>;
+ }
+ }
+}
+
+foreach vt = Reg32DataTypes.types in {
+ foreach st = Reg32DataTypes.types in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_32>;
+ }
+ }
+}
// 64-bit bitcast
-def : BitConvert <i64, f64, VReg_64>;
-def : BitConvert <f64, i64, VReg_64>;
-def : BitConvert <v2i32, v2f32, VReg_64>;
-def : BitConvert <v2f32, v2i32, VReg_64>;
-def : BitConvert <i64, v2i32, VReg_64>;
-def : BitConvert <v2i32, i64, VReg_64>;
-def : BitConvert <i64, v2f32, VReg_64>;
-def : BitConvert <v2f32, i64, VReg_64>;
-def : BitConvert <f64, v2f32, VReg_64>;
-def : BitConvert <v2f32, f64, VReg_64>;
-def : BitConvert <f64, v2i32, VReg_64>;
-def : BitConvert <v2i32, f64, VReg_64>;
-def : BitConvert <v4i16, v4f16, VReg_64>;
-def : BitConvert <v4f16, v4i16, VReg_64>;
-def : BitConvert <v4bf16, v2i32, VReg_64>;
-def : BitConvert <v2i32, v4bf16, VReg_64>;
-def : BitConvert <v4bf16, i64, VReg_64>;
-def : BitConvert <i64, v4bf16, VReg_64>;
-def : BitConvert <v4bf16, v4i16, VReg_64>;
-def : BitConvert <v4i16, v4bf16, VReg_64>;
-def : BitConvert <v4bf16, v4f16, VReg_64>;
-def : BitConvert <v4f16, v4bf16, VReg_64>;
-def : BitConvert <v4bf16, v2f32, VReg_64>;
-def : BitConvert <v2f32, v4bf16, VReg_64>;
-def : BitConvert <v4bf16, f64, VReg_64>;
-def : BitConvert <f64, v4bf16, VReg_64>;
-
-
-// FIXME: Make SGPR
-def : BitConvert <v2i32, v4f16, VReg_64>;
-def : BitConvert <v4f16, v2i32, VReg_64>;
-def : BitConvert <v2i32, v4f16, VReg_64>;
-def : BitConvert <v2i32, v4i16, VReg_64>;
-def : BitConvert <v4i16, v2i32, VReg_64>;
-def : BitConvert <v2f32, v4f16, VReg_64>;
-def : BitConvert <v4f16, v2f32, VReg_64>;
-def : BitConvert <v2f32, v4i16, VReg_64>;
-def : BitConvert <v4i16, v2f32, VReg_64>;
-def : BitConvert <v4i16, f64, VReg_64>;
-def : BitConvert <v4f16, f64, VReg_64>;
-def : BitConvert <f64, v4i16, VReg_64>;
-def : BitConvert <f64, v4f16, VReg_64>;
-def : BitConvert <v4i16, i64, VReg_64>;
-def : BitConvert <v4f16, i64, VReg_64>;
-def : BitConvert <i64, v4i16, VReg_64>;
-def : BitConvert <i64, v4f16, VReg_64>;
-
-def : BitConvert <v4i32, v4f32, VReg_128>;
-def : BitConvert <v4f32, v4i32, VReg_128>;
+foreach vt = Reg64DataTypes.types in {
+ foreach st = Reg64DataTypes.types in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_64>;
+ }
+ }
+}
+
// 96-bit bitcast
-def : BitConvert <v3i32, v3f32, SGPR_96>;
-def : BitConvert <v3f32, v3i32, SGPR_96>;
+foreach vt = SGPR_96.RegTypes in {
+ foreach st = SGPR_96.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SGPR_96>;
+ }
+ }
+}
+
// 128-bit bitcast
-def : BitConvert <v2i64, v4i32, SReg_128>;
-def : BitConvert <v4i32, v2i64, SReg_128>;
-def : BitConvert <v2f64, v4f32, VReg_128>;
-def : BitConvert <v2f64, v4i32, VReg_128>;
-def : BitConvert <v4f32, v2f64, VReg_128>;
-def : BitConvert <v4i32, v2f64, VReg_128>;
-def : BitConvert <v2i64, v2f64, VReg_128>;
-def : BitConvert <v2f64, v2i64, VReg_128>;
-def : BitConvert <v4f32, v2i64, VReg_128>;
-def : BitConvert <v2i64, v4f32, VReg_128>;
-def : BitConvert <v8i16, v4i32, SReg_128>;
-def : BitConvert <v4i32, v8i16, SReg_128>;
-def : BitConvert <v8f16, v4f32, VReg_128>;
-def : BitConvert <v8f16, v4i32, VReg_128>;
-def : BitConvert <v4f32, v8f16, VReg_128>;
-def : BitConvert <v4i32, v8f16, VReg_128>;
-def : BitConvert <v8i16, v8f16, VReg_128>;
-def : BitConvert <v8f16, v8i16, VReg_128>;
-def : BitConvert <v4f32, v8i16, VReg_128>;
-def : BitConvert <v8i16, v4f32, VReg_128>;
-def : BitConvert <v8i16, v8f16, SReg_128>;
-def : BitConvert <v8i16, v2i64, SReg_128>;
-def : BitConvert <v8i16, v2f64, SReg_128>;
-def : BitConvert <v8f16, v2i64, SReg_128>;
-def : BitConvert <v8f16, v2f64, SReg_128>;
-def : BitConvert <v8f16, v8i16, SReg_128>;
-def : BitConvert <v2i64, v8i16, SReg_128>;
-def : BitConvert <v2f64, v8i16, SReg_128>;
-def : BitConvert <v2i64, v8f16, SReg_128>;
-def : BitConvert <v2f64, v8f16, SReg_128>;
-
-def : BitConvert <v4i32, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v4i32, SReg_128>;
-def : BitConvert <v4i32, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v4i32, VReg_128>;
-
-def : BitConvert <v4f32, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v4f32, SReg_128>;
-def : BitConvert <v4f32, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v4f32, VReg_128>;
-
-def : BitConvert <v8i16, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v8i16, SReg_128>;
-def : BitConvert <v8i16, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v8i16, VReg_128>;
-
-def : BitConvert <v8f16, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v8f16, SReg_128>;
-def : BitConvert <v8f16, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v8f16, VReg_128>;
-
-def : BitConvert <v2f64, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v2f64, SReg_128>;
-def : BitConvert <v2f64, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v2f64, VReg_128>;
-
-def : BitConvert <v2i64, v8bf16, SReg_128>;
-def : BitConvert <v8bf16, v2i64, SReg_128>;
-def : BitConvert <v2i64, v8bf16, VReg_128>;
-def : BitConvert <v8bf16, v2i64, VReg_128>;
+foreach vt = VReg_128.RegTypes in {
+ foreach st = VReg_128.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_128>;
+ }
+ }
+}
+
+foreach vt = SReg_128.RegTypes in {
+ foreach st = SReg_128.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_128>;
+ }
+ }
+}
// 160-bit bitcast
-def : BitConvert <v5i32, v5f32, SReg_160>;
-def : BitConvert <v5f32, v5i32, SReg_160>;
-def : BitConvert <v5i32, v5f32, VReg_160>;
-def : BitConvert <v5f32, v5i32, VReg_160>;
+foreach vt = VReg_160.RegTypes in {
+ foreach st = VReg_160.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_160>;
+ }
+ }
+}
+
+foreach vt = SReg_160.RegTypes in {
+ foreach st = SReg_160.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_160>;
+ }
+ }
+}
// 192-bit bitcast
-def : BitConvert <v6i32, v6f32, SReg_192>;
-def : BitConvert <v6f32, v6i32, SReg_192>;
-def : BitConvert <v6i32, v6f32, VReg_192>;
-def : BitConvert <v6f32, v6i32, VReg_192>;
-def : BitConvert <v3i64, v3f64, VReg_192>;
-def : BitConvert <v3f64, v3i64, VReg_192>;
-def : BitConvert <v3i64, v6i32, VReg_192>;
-def : BitConvert <v3i64, v6f32, VReg_192>;
-def : BitConvert <v3f64, v6i32, VReg_192>;
-def : BitConvert <v3f64, v6f32, VReg_192>;
-def : BitConvert <v6i32, v3i64, VReg_192>;
-def : BitConvert <v6f32, v3i64, VReg_192>;
-def : BitConvert <v6i32, v3f64, VReg_192>;
-def : BitConvert <v6f32, v3f64, VReg_192>;
+foreach vt = VReg_192.RegTypes in {
+ foreach st = VReg_192.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_192>;
+ }
+ }
+}
+
+foreach vt = SReg_192.RegTypes in {
+ foreach st = SReg_192.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_192>;
+ }
+ }
+}
// 224-bit bitcast
-def : BitConvert <v7i32, v7f32, SReg_224>;
-def : BitConvert <v7f32, v7i32, SReg_224>;
-def : BitConvert <v7i32, v7f32, VReg_224>;
-def : BitConvert <v7f32, v7i32, VReg_224>;
+foreach vt = VReg_224.RegTypes in {
+ foreach st = VReg_224.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_224>;
+ }
+ }
+}
-// 256-bit bitcast
-def : BitConvert <v8i32, v8f32, SReg_256>;
-def : BitConvert <v8f32, v8i32, SReg_256>;
-def : BitConvert <v8i32, v8f32, VReg_256>;
-def : BitConvert <v8f32, v8i32, VReg_256>;
-def : BitConvert <v4i64, v4f64, VReg_256>;
-def : BitConvert <v4f64, v4i64, VReg_256>;
-def : BitConvert <v4i64, v8i32, VReg_256>;
-def : BitConvert <v4i64, v8f32, VReg_256>;
-def : BitConvert <v4f64, v8i32, VReg_256>;
-def : BitConvert <v4f64, v8f32, VReg_256>;
-def : BitConvert <v8i32, v4i64, VReg_256>;
-def : BitConvert <v8f32, v4i64, VReg_256>;
-def : BitConvert <v8i32, v4f64, VReg_256>;
-def : BitConvert <v8f32, v4f64, VReg_256>;
-def : BitConvert <v16i16, v16f16, SReg_256>;
-def : BitConvert <v16f16, v16i16, SReg_256>;
-def : BitConvert <v16i16, v16f16, VReg_256>;
-def : BitConvert <v16f16, v16i16, VReg_256>;
-def : BitConvert <v16f16, v8i32, VReg_256>;
-def : BitConvert <v16i16, v8i32, VReg_256>;
-def : BitConvert <v16f16, v8f32, VReg_256>;
-def : BitConvert <v16i16, v8f32, VReg_256>;
-def : BitConvert <v8i32, v16f16, VReg_256>;
-def : BitConvert <v8i32, v16i16, VReg_256>;
-def : BitConvert <v8f32, v16f16, VReg_256>;
-def : BitConvert <v8f32, v16i16, VReg_256>;
-def : BitConvert <v16f16, v4i64, VReg_256>;
-def : BitConvert <v16i16, v4i64, VReg_256>;
-def : BitConvert <v16f16, v4f64, VReg_256>;
-def : BitConvert <v16i16, v4f64, VReg_256>;
-def : BitConvert <v4i64, v16f16, VReg_256>;
-def : BitConvert <v4i64, v16i16, VReg_256>;
-def : BitConvert <v4f64, v16f16, VReg_256>;
-def : BitConvert <v4f64, v16i16, VReg_256>;
-
-
-def : BitConvert <v8i32, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v8i32, VReg_256>;
-def : BitConvert <v8f32, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v8f32, VReg_256>;
-def : BitConvert <v4i64, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v4i64, VReg_256>;
-def : BitConvert <v4f64, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v4f64, VReg_256>;
-
-
-
-def : BitConvert <v16i16, v16bf16, SReg_256>;
-def : BitConvert <v16bf16, v16i16, SReg_256>;
-def : BitConvert <v16i16, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v16i16, VReg_256>;
-
-def : BitConvert <v16f16, v16bf16, SReg_256>;
-def : BitConvert <v16bf16, v16f16, SReg_256>;
-def : BitConvert <v16f16, v16bf16, VReg_256>;
-def : BitConvert <v16bf16, v16f16, VReg_256>;
+foreach vt = SReg_224.RegTypes in {
+ foreach st = SReg_224.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_224>;
+ }
+ }
+}
+// 256-bit bitcast
+foreach vt = VReg_256.RegTypes in {
+ foreach st = VReg_256.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_256>;
+ }
+ }
+}
+
+foreach vt = SReg_256.RegTypes in {
+ foreach st = SReg_256.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_256>;
+ }
+ }
+}
// 288-bit bitcast
-def : BitConvert <v9i32, v9f32, SReg_288>;
-def : BitConvert <v9f32, v9i32, SReg_288>;
-def : BitConvert <v9i32, v9f32, VReg_288>;
-def : BitConvert <v9f32, v9i32, VReg_288>;
+foreach vt = VReg_288.RegTypes in {
+ foreach st = VReg_288.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_288>;
+ }
+ }
+}
+
+foreach vt = SReg_288.RegTypes in {
+ foreach st = SReg_288.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_288>;
+ }
+ }
+}
// 320-bit bitcast
-def : BitConvert <v10i32, v10f32, SReg_320>;
-def : BitConvert <v10f32, v10i32, SReg_320>;
-def : BitConvert <v10i32, v10f32, VReg_320>;
-def : BitConvert <v10f32, v10i32, VReg_320>;
+foreach vt = VReg_320.RegTypes in {
+ foreach st = VReg_320.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_320>;
+ }
+ }
+}
+
+foreach vt = SReg_320.RegTypes in {
+ foreach st = SReg_320.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_320>;
+ }
+ }
+}
// 320-bit bitcast
-def : BitConvert <v11i32, v11f32, SReg_352>;
-def : BitConvert <v11f32, v11i32, SReg_352>;
-def : BitConvert <v11i32, v11f32, VReg_352>;
-def : BitConvert <v11f32, v11i32, VReg_352>;
+foreach vt = VReg_352.RegTypes in {
+ foreach st = VReg_352.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_352>;
+ }
+ }
+}
+
+foreach vt = SReg_352.RegTypes in {
+ foreach st = SReg_352.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_352>;
+ }
+ }
+}
// 384-bit bitcast
-def : BitConvert <v12i32, v12f32, SReg_384>;
-def : BitConvert <v12f32, v12i32, SReg_384>;
-def : BitConvert <v12i32, v12f32, VReg_384>;
-def : BitConvert <v12f32, v12i32, VReg_384>;
+foreach vt = VReg_384.RegTypes in {
+ foreach st = VReg_384.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_384>;
+ }
+ }
+}
+
+foreach vt = SReg_384.RegTypes in {
+ foreach st = SReg_384.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_384>;
+ }
+ }
+}
// 512-bit bitcast
-def : BitConvert <v32f16, v32i16, VReg_512>;
-def : BitConvert <v32i16, v32f16, VReg_512>;
-def : BitConvert <v32f16, v16i32, VReg_512>;
-def : BitConvert <v32f16, v16f32, VReg_512>;
-def : BitConvert <v16f32, v32f16, VReg_512>;
-def : BitConvert <v16i32, v32f16, VReg_512>;
-def : BitConvert <v32i16, v16i32, VReg_512>;
-def : BitConvert <v32i16, v16f32, VReg_512>;
-def : BitConvert <v16f32, v32i16, VReg_512>;
-def : BitConvert <v16i32, v32i16, VReg_512>;
-def : BitConvert <v16i32, v16f32, VReg_512>;
-def : BitConvert <v16f32, v16i32, VReg_512>;
-def : BitConvert <v8i64, v8f64, VReg_512>;
-def : BitConvert <v8f64, v8i64, VReg_512>;
-def : BitConvert <v8i64, v16i32, VReg_512>;
-def : BitConvert <v8f64, v16i32, VReg_512>;
-def : BitConvert <v16i32, v8i64, VReg_512>;
-def : BitConvert <v16i32, v8f64, VReg_512>;
-def : BitConvert <v8i64, v16f32, VReg_512>;
-def : BitConvert <v8f64, v16f32, VReg_512>;
-def : BitConvert <v16f32, v8i64, VReg_512>;
-def : BitConvert <v16f32, v8f64, VReg_512>;
-def : BitConvert <v8i64, v32f16, VReg_512>;
-def : BitConvert <v8i64, v32i16, VReg_512>;
-def : BitConvert <v8f64, v32f16, VReg_512>;
-def : BitConvert <v8f64, v32i16, VReg_512>;
-def : BitConvert <v32f16, v8i64, VReg_512>;
-def : BitConvert <v32f16, v8f64, VReg_512>;
-def : BitConvert <v32i16, v8i64, VReg_512>;
-def : BitConvert <v32i16, v8f64, VReg_512>;
-
-
-def : BitConvert <v32bf16, v32i16, VReg_512>;
-def : BitConvert <v32i16, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v32i16, SReg_512>;
-def : BitConvert <v32i16, v32bf16, SReg_512>;
-
-def : BitConvert <v32bf16, v32f16, VReg_512>;
-def : BitConvert <v32f16, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v32f16, SReg_512>;
-def : BitConvert <v32f16, v32bf16, SReg_512>;
-
-def : BitConvert <v32bf16, v16i32, VReg_512>;
-def : BitConvert <v16i32, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v16i32, SReg_512>;
-def : BitConvert <v16i32, v32bf16, SReg_512>;
-
-def : BitConvert <v32bf16, v16f32, VReg_512>;
-def : BitConvert <v16f32, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v16f32, SReg_512>;
-def : BitConvert <v16f32, v32bf16, SReg_512>;
-
-def : BitConvert <v32bf16, v8f64, VReg_512>;
-def : BitConvert <v8f64, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v8f64, SReg_512>;
-def : BitConvert <v8f64, v32bf16, SReg_512>;
-
-def : BitConvert <v32bf16, v8i64, VReg_512>;
-def : BitConvert <v8i64, v32bf16, VReg_512>;
-def : BitConvert <v32bf16, v8i64, SReg_512>;
-def : BitConvert <v8i64, v32bf16, SReg_512>;
+foreach vt = VReg_512.RegTypes in {
+ foreach st = VReg_512.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_512>;
+ }
+ }
+}
+
+foreach vt = SReg_512.RegTypes in {
+ foreach st = SReg_512.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_512>;
+ }
+ }
+}
// 1024-bit bitcast
-def : BitConvert <v32i32, v32f32, VReg_1024>;
-def : BitConvert <v32f32, v32i32, VReg_1024>;
-def : BitConvert <v16i64, v16f64, VReg_1024>;
-def : BitConvert <v16f64, v16i64, VReg_1024>;
-def : BitConvert <v16i64, v32i32, VReg_1024>;
-def : BitConvert <v32i32, v16i64, VReg_1024>;
-def : BitConvert <v16f64, v32f32, VReg_1024>;
-def : BitConvert <v32f32, v16f64, VReg_1024>;
-def : BitConvert <v16i64, v32f32, VReg_1024>;
-def : BitConvert <v32i32, v16f64, VReg_1024>;
-def : BitConvert <v16f64, v32i32, VReg_1024>;
-def : BitConvert <v32f32, v16i64, VReg_1024>;
+foreach vt = VReg_1024.RegTypes in {
+ foreach st = VReg_1024.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, VReg_1024>;
+ }
+ }
+}
+
+foreach vt = SReg_1024.RegTypes in {
+ foreach st = SReg_1024.RegTypes in {
+ if !not(!eq (vt, st)) then {
+ def : BitConvert <vt, st, SReg_1024>;
+ }
+ }
+}
/********** =================== **********/
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.td b/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
index 35c7b393a8ca4..6b54764a876ca 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
@@ -547,8 +547,12 @@ class RegisterTypes<list<ValueType> reg_types> {
}
def Reg16Types : RegisterTypes<[i16, f16, bf16]>;
-def Reg32Types : RegisterTypes<[i32, f32, v2i16, v2f16, v2bf16, p2, p3, p5, p6]>;
-def Reg64Types : RegisterTypes<[i64, f64, v2i32, v2f32, p0, p1, p4, v4i16, v4f16, v4bf16]>;
+def Reg32DataTypes: RegisterTypes<[i32, f32, v2i16, v2f16, v2bf16]>;
+def Reg32PtrTypes: RegisterTypes<[p2, p3, p5, p6]>;
+def Reg32Types : RegisterTypes<!listconcat(Reg32DataTypes.types, Reg32PtrTypes.types)>;
+def Reg64DataTypes: RegisterTypes<[i64, f64, v2i32, v2f32, v4i16, v4f16, v4bf16]>;
+def Reg64PtrTypes: RegisterTypes<[p0, p1, p4]>;
+def Reg64Types : RegisterTypes<!listconcat(Reg64DataTypes.types, Reg64PtrTypes.types)>;
def Reg96Types : RegisterTypes<[v3i32, v3f32]>;
def Reg128Types : RegisterTypes<[v4i32, v4f32, v2i64, v2f64, v8i16, v8f16, v8bf16]>;
@@ -940,8 +944,7 @@ multiclass VRegClass<int numRegs, list<ValueType> regTypes, dag regList> {
}
}
-defm VReg_64 : VRegClass<2, [i64, f64, v2i32, v2f32, v4f16, v4bf16, v4i16, p0, p1, p4],
- (add VGPR_64)>;
+defm VReg_64 : VRegClass<2, Reg64Types.types, (add VGPR_64)>;
defm VReg_96 : VRegClass<3, Reg96Types.types, (add VGPR_96)>;
defm VReg_128 : VRegClass<4, Reg128Types.types, (add VGPR_128)>;
defm VReg_160 : VRegClass<5, [v5i32, v5f32], (add VGPR_160)>;
diff --git a/llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll b/llv...
[truncated]
There should be no test changes. Splitting the tests by type should be done separately
And it should still drop the test changes and leave this as an NFC refactoring only.
This PR adds test cases for all types of bit conversion; it prepares for PR #132899. All tests passed due to:

1. For the DAG, the patterns do not separate SReg and VReg. One of the samples is:

   ```llvm
   define <2 x double> @v_bitcast_v4f32_to_v2f64(<4 x float> inreg %a, i32 %b) {
     %cmp = icmp eq i32 %b, 0
     br i1 %cmp, label %cmp.true, label %cmp.false

   cmp.true:
     %a1 = fadd <4 x float> %a, splat (float 1.000000e+00)
     %a2 = bitcast <4 x float> %a1 to <2 x double>
     br label %end

   cmp.false:
     %a3 = bitcast <4 x float> %a to <2 x double>
     br label %end

   end:
     %phi = phi <2 x double> [ %a2, %cmp.true ], [ %a3, %cmp.false ]
     ret <2 x double> %phi
   }
   ```

   It is supposed to select from the scalar register patterns, but the VReg pattern is matched instead, as shown in the debug log:

   ```
   Debug log:
   ISEL: Starting selection on root node: t3: v2f64 = bitcast t2
   ISEL: Starting pattern match
     Initial Opcode index to 440336
     Skipped scope entry (due to false predicate) at index 440339, continuing at 440367
     Skipped scope entry (due to false predicate) at index 440368, continuing at 440396
     Skipped scope entry (due to false predicate) at index 440397, continuing at 440435
     Skipped scope entry (due to false predicate) at index 440436, continuing at 440467
     Skipped scope entry (due to false predicate) at index 440468, continuing at 440499
     Skipped scope entry (due to false predicate) at index 440500, continuing at 440552
     Skipped scope entry (due to false predicate) at index 440553, continuing at 440587
     Skipped scope entry (due to false predicate) at index 440588, continuing at 440622
     Skipped scope entry (due to false predicate) at index 440623, continuing at 440657
     Skipped scope entry (due to false predicate) at index 440658, continuing at 440692
     Skipped scope entry (due to false predicate) at index 440693, continuing at 440727
     Skipped scope entry (due to false predicate) at index 440728, continuing at 440769
     Skipped scope entry (due to false predicate) at index 440770, continuing at 440798
     Skipped scope entry (due to false predicate) at index 440799, continuing at 440836
     Skipped scope entry (due to false predicate) at index 440837, continuing at 440870
     TypeSwitch[v2f64] from 440873 to 440892
   Patterns:
   /*440892*/  OPC_CompleteMatch, 1, 0,
               // Src: (bitconvert:{ *:[v2f64] } VReg_128:{ *:[v4f32] }:$src0) - Complexity = 3
               // Dst: VReg_128:{ *:[v2f64] }:$src0
   ```

2. GlobalISel uses `Select_COPY` to select bitcast.
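For reference, the `BitConvert` patterns that #132899 generates in loops reduce to a plain operand forward, which is consistent with the `OPC_CompleteMatch ... Dst: VReg_128:$src0` entry in the log above. A rough sketch of the helper class follows (an approximation of the pre-existing definition in SIInstructions.td, not something added by either PR):

```tablegen
// Approximate shape of the existing BitConvert helper: a bitcast from
// source type st to destination type dt within register class rc selects
// to a reuse of the source register operand, emitting no new instruction.
class BitConvert <ValueType dt, ValueType st, RegisterClass rc> : GCNPat <
  (dt (bitconvert (st rc:$src0))),
  (dt rc:$src0)
>;
```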
This patch should still have no test changes in it
using loop to define data type convert patterns